type variable
Evaluating Program Semantics Reasoning with Type Inference in System F
He, Yifeng, Yang, Luning, Gonzalo, Christopher Castro Gaw, Chen, Hao
Large Language Models (LLMs) are increasingly integrated into the software engineering ecosystem. Their test-time compute (TTC) reasoning capabilities show significant potential for understanding program logic and semantics beyond mere token recognition. However, current benchmarks for code reasoning lack a formal, program-centric deductive framework to ensure sound evaluation, and are incapable of assessing whether models genuinely reason about program semantics or merely exploit superficial associations between natural language and code tokens. To bridge this gap, we introduce TF-Bench, a benchmark designed to evaluate LLM reasoning based on type inference in System F, a task we refer to as program semantics reasoning. By employing verified transformations to remove semantically irrelevant natural language, we construct TF-Bench_pure, a purely semantics-driven variant of TF-Bench. Our analysis reveals substantial limitations in state-of-the-art LLMs, with the best-performing LLM (Claude-3.7-sonnet) achieving only 55.85% accuracy on TF-Bench_pure. Additionally, we propose two novel metrics to assess robustness and the effectiveness of test-time reasoning, underscoring critical limitations in current LLM capabilities and highlighting essential directions for future research.
Learning functional programs with function invention and reuse
Inductive programming (IP) is a field whose main goal is synthesising programs that respect a set of examples, given some form of background knowledge. This paper is concerned with a subfield of IP, inductive functional programming (IFP). We explore the idea of generating modular functional programs, and how those allow for function reuse, with the aim to reduce the size of the programs. We introduce two algorithms that attempt to solve the problem and explore type based pruning techniques in the context of modular programs. By experimenting with the implementation of one of those algorithms, we show reuse is important (if not crucial) for a variety of problems and distinguished two broad classes of programs that will generally benefit from function reuse.
Type-driven Neural Programming by Example
In this thesis we look into programming by example (PBE), which is about finding a program mapping given inputs to given outputs. PBE has traditionally seen a split between formal versus neural approaches, where formal approaches typically involve deductive techniques such as SAT solvers and types, while the neural approaches involve training on sample input-outputs with their corresponding program, typically using sequence-based machine learning techniques such as LSTMs [41]. As a result of this split, programming types had yet to be used in neural program synthesis techniques. We propose a way to incorporate programming types into a neural program synthesis approach for PBE. We introduce the Typed Neuro-Symbolic Program Synthesis (TNSPS) method based on this idea, and test it in the functional programming context to empirically verify type information may help improve generalization in neural synthesizers on limited-size datasets. Our TNSPS model builds upon the existing Neuro-Symbolic Program Synthesis (NSPS), a tree-based neural synthesizer combining info from input-output examples plus the current program, by further exposing information on types of those input-output examples, of the grammar production rules, as well as of the hole that we wish to expand in the program. We further explain how we generated a dataset within our domain, which uses a limited subset of Haskell as the synthesis language. Finally we discuss several topics of interest that may help take these ideas further. For reproducibility, we release our code publicly.
𝐖ell-𝚻yped 𝐑eflections
As one can observe, the basic MLsub type inference algorithm is actually quite similar to the traditional Algorithm J for HM-style type inference, and is not complicated at all! First, note that type variable bounds, which are updated using mutable variables, may very well form cycles as type inference progresses.
LambdaNet: Probabilistic Type Inference using Graph Neural Networks
Wei, Jiayi, Goyal, Maruth, Durrett, Greg, Dillig, Isil
As gradual typing becomes increasingly popular in languages like Python and TypeScript, there is a growing need to infer type annotations automatically. While type annotations help with tasks like code completion and static error catching, these annotations cannot be fully determined by compilers and are tedious to annotate by hand. This paper proposes a probabilistic type inference scheme for TypeScript based on a graph neural network. Our approach first uses lightweight source code analysis to generate a program abstraction called a type dependency graph, which links type variables with logical constraints as well as name and usage information. Given this program abstraction, we then use a graph neural network to propagate information between related type variables and eventually make type predictions. Our neural architecture can predict both standard types, like number or string, as well as user-defined types that have not been encountered during training. Our experimental results show that our approach outperforms prior work in this space by $14\%$ (absolute) on library types, while having the ability to make type predictions that are out of scope for existing techniques.
Compositional Model Repositories via Dynamic Constraint Satisfaction with Order-of-Magnitude Preferences
The predominant knowledge-based approach to automated model construction, compositional modelling, employs a set of models of particular functional components. Its inference mechanism takes a scenario describing the constituent interacting components of a system and translates it into a useful mathematical model. This paper presents a novel compositional modelling approach aimed at building model repositories. It furthers the field in two respects. Firstly, it expands the application domain of compositional modelling to systems that can not be easily described in terms of interacting functional components, such as ecological systems. Secondly, it enables the incorporation of user preferences into the model selection process. These features are achieved by casting the compositional modelling problem as an activity-based dynamic preference constraint satisfaction problem, where the dynamic constraints describe the restrictions imposed over the composition of partial models and the preferences correspond to those of the user of the automated modeller. In addition, the preference levels are represented through the use of symbolic values that differ in orders of magnitude.